Feature subset selection for logistic regression via mixed integer optimization
Authors
Abstract
This paper concerns a method of selecting a subset of features for a logistic regression model. Information criteria, such as the Akaike information criterion and the Bayesian information criterion, are employed as a goodness-of-fit measure. Using a piecewise linear approximation, the feature subset selection problem is formulated as a mixed integer linear optimization problem, which can be solved with standard mathematical optimization software. Computational experiments show that, in terms of solution quality, the proposed method outperforms common stepwise methods.
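The core idea of the abstract, approximating the convex logistic loss by a piecewise linear function so the problem becomes a linear (rather than nonlinear) mixed integer program, can be illustrated with tangent lines. The sketch below is not the paper's exact formulation; the breakpoints and function names are illustrative. The pointwise maximum of tangent lines is a piecewise-linear lower bound on the loss, which in a MILP becomes one linear constraint per tangent.

```python
import math

def logistic_loss(v):
    # log(1 + exp(-v)), computed in a numerically stable way
    return math.log1p(math.exp(-abs(v))) + max(-v, 0.0)

def tangent_lines(breakpoints):
    # Tangent to the convex loss at b: l_b(v) = f(b) + f'(b) * (v - b),
    # where f'(v) = -1 / (1 + exp(v)).
    lines = []
    for b in breakpoints:
        slope = -1.0 / (1.0 + math.exp(b))
        intercept = logistic_loss(b) - slope * b
        lines.append((slope, intercept))
    return lines

def pl_approx(v, lines):
    # Pointwise maximum of the tangents: a piecewise-linear lower bound.
    # In a MILP this becomes constraints t >= slope * v + intercept.
    return max(s * v + c for s, c in lines)

lines = tangent_lines([-4, -2, 0, 2, 4])
for v in (-1.0, 0.5, 3.0):
    print(v, logistic_loss(v), pl_approx(v, lines))
```

The approximation is exact at each breakpoint and never exceeds the true loss anywhere, which is what allows the linearized problem to bound the original one.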
Similar resources
Piecewise-Linear Approximation for Feature Subset Selection in a Sequential Logit Model
Abstract This paper concerns a method of selecting a subset of features for a sequential logit model. Tanaka and Nakagawa (2014) proposed a mixed integer quadratic optimization formulation for solving the problem based on a quadratic approximation of the logistic loss function. However, since there is a significant gap between the logistic loss function and its quadratic approximation, their fo...
Regression under a Modern Optimization Lens
In the last twenty-five years (1990-2014), algorithmic advances in integer optimization combined with hardware improvements have resulted in an astonishing 200 billion factor speedup in solving mixed integer optimization (MIO) problems ([16], [85], [104]). The common mindset of MIO as theoretically elegant but practically irrelevant is no longer justified. In this thesis, we propose a methodolo...
Network Intrusion Detection through Discriminative Feature Selection by Using Sparse Logistic Regression
An intrusion detection system (IDS) is a well-known and effective component of network security that provides network systems with security and safety. Most earlier research has addressed difficulties such as overfitting, feature redundancy, high-dimensional features, and a limited number of training samples, but not feature selection. We approach the problem of feature selectio...
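Sparse logistic regression of the kind this abstract refers to is commonly realized with an L1 penalty, which drives many coefficients exactly to zero. The sketch below uses scikit-learn on synthetic data as an assumed stand-in; the data, the regularization strength `C`, and the solver choice are illustrative, not the cited paper's setup.

```python
# Minimal sketch: feature selection via L1-penalized logistic regression.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n, p = 200, 20
X = rng.standard_normal((n, p))
# Only the first three features actually drive the labels.
logits = 2.0 * X[:, 0] - 1.5 * X[:, 1] + 1.0 * X[:, 2]
y = (logits + 0.1 * rng.standard_normal(n) > 0).astype(int)

# The L1 penalty zeroes out coefficients of irrelevant features.
clf = LogisticRegression(penalty="l1", solver="liblinear", C=0.1)
clf.fit(X, y)
selected = np.flatnonzero(clf.coef_[0])
print("selected features:", selected)
```

The nonzero coefficients identify the selected subset; tightening `C` shrinks the subset further.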
The Discrete Dantzig Selector: Estimating Sparse Linear Models via Mixed Integer Linear Optimization
We propose a new high-dimensional linear regression estimator: the Discrete Dantzig Selector, which minimizes the number of nonzero regression coefficients, subject to a budget on the maximal absolute correlation between the features and residuals. We show that the estimator can be expressed as a solution to a Mixed Integer Linear Optimization (MILO) problem, a computationally tractable framewo...
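The MILO formulation this abstract describes, minimizing the number of nonzero coefficients subject to a budget on the maximal feature-residual correlation, can be written down directly with a generic MILP solver. The sketch below uses `scipy.optimize.milp` on noiseless synthetic data; the big-M constant, the budget `delta`, and the problem sizes are illustrative assumptions, not the authors' implementation.

```python
import numpy as np
from scipy.optimize import milp, LinearConstraint, Bounds

rng = np.random.default_rng(1)
n, p, M, delta = 30, 4, 10.0, 0.5
X = rng.standard_normal((n, p))
beta_true = np.array([1.5, 0.0, -2.0, 0.0])
y = X @ beta_true            # noiseless, to keep the sketch simple

G, g = X.T @ X, X.T @ y
I, Z = np.eye(p), np.zeros((p, p))

# Decision vector x = [beta (continuous), z (binary)]; minimize sum(z).
c = np.concatenate([np.zeros(p), np.ones(p)])
constraints = [
    # Correlation budget: -delta <= X'y - X'X beta <= delta
    LinearConstraint(np.hstack([-G, Z]), -delta - g, delta - g),
    # Big-M link: beta_i <= M z_i and -beta_i <= M z_i
    LinearConstraint(np.hstack([I, -M * I]), -np.inf, 0.0),
    LinearConstraint(np.hstack([-I, -M * I]), -np.inf, 0.0),
]
integrality = np.concatenate([np.zeros(p), np.ones(p)])  # z is binary
bounds = Bounds(np.concatenate([-M * np.ones(p), np.zeros(p)]),
                np.concatenate([M * np.ones(p), np.ones(p)]))

res = milp(c=c, constraints=constraints,
           integrality=integrality, bounds=bounds)
z = np.round(res.x[p:]).astype(int)
print("support size:", z.sum(), "selected:", np.flatnonzero(z))
```

Since the true two-sparse coefficient vector is feasible here, the optimal support size can be at most two, while the all-zero vector violates the correlation budget.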
Towards Feature Selection in Networks
Traditional feature selection methods assume that the data are independent and identically distributed (i.i.d.). In the real world, tremendous amounts of data are distributed in a network. Existing feature selection methods are not suited for networked data because the i.i.d. assumption no longer holds. This motivates us to study feature selection in a network. In this paper, we present a supervis...
Journal: Comp. Opt. and Appl.
Volume: 64, Issue: -
Pages: -
Publication date: 2016